Smart Paper Notes

Panning for gold: Lessons learned from the platform-agnostic automated detection of political content in textual data

Mykola Makhortykh , Ernesto de León , Aleksandra Urman , Clara Christner , Maryna Sydorova , Silke Adam , Michaela Maier , Teresa Gil-Lopez

Category: Natural Language Processing

2022-07-01

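The growing availability of data about online information behaviour opens new possibilities for political communication research. However, the volume and variety of these data make them difficult to analyse and prompt the need for automated content analysis approaches that rely on a broad range of natural language processing techniques (e.g., machine learning-based or neural network-based ones). In this paper, we discuss how these techniques can be used to detect political content across different platforms. Using three validation datasets, which include a variety of political and non-political textual documents from online platforms, we systematically compare the performance of three groups of detection techniques relying on dictionaries, supervised machine learning, or neural networks. We also examine the impact of different modes of data preprocessing (e.g., stemming and stopword removal) on low-cost implementations of these techniques, using a large set (n = 66) of detection models. Our results show that preprocessing has a limited impact on model performance: the best results for less noisy data are achieved by neural network-based and machine learning-based models, whereas dictionary-based models perform more robustly on noisy data.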
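As a rough illustration of the dictionary-based group of techniques and of the preprocessing switches mentioned above, the following is a minimal sketch rather than the authors' implementation. The term list, the stopword list, the `crude_stem` helper, and the 0.1 threshold are all hypothetical choices made for this example.

```python
import re

# Hypothetical mini-dictionary and stopword list, for illustration only.
POLITICAL_TERMS = {"election", "parliament", "minister", "vote", "policy"}
STOPWORDS = {"the", "a", "an", "of", "in", "on", "for", "and", "to", "is"}


def crude_stem(token):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def tokenize(text, stem=False, drop_stopwords=False):
    """Lowercase, split into alphabetic tokens, then apply optional preprocessing."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if drop_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if stem:
        tokens = [crude_stem(t) for t in tokens]
    return tokens


def political_score(text, stem=False, drop_stopwords=False):
    """Share of tokens that match the dictionary (0.0 for empty input)."""
    tokens = tokenize(text, stem=stem, drop_stopwords=drop_stopwords)
    if not tokens:
        return 0.0
    # Stem the dictionary too, so that both sides are normalised the same way.
    terms = {crude_stem(t) for t in POLITICAL_TERMS} if stem else POLITICAL_TERMS
    hits = sum(1 for t in tokens if t in terms)
    return hits / len(tokens)


def is_political(text, threshold=0.1, **preprocessing):
    """Flag a document whose dictionary-hit share reaches the threshold."""
    return political_score(text, **preprocessing) >= threshold
```

Calling `political_score` on the same text with and without `stem=True, drop_stopwords=True` gives a quick feel for how much, or how little, preprocessing moves the score for a given document.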

Dynamic Sparse Network for Time Series Classification: Learning What to "see"

Qiao Xiao , Boqian Wu , Yu Zhang , Shiwei Liu , Mykola Pechenizkiy , Elena Mocanu , Decebal Constantin Mocanu

Categories: Machine Learning | Artificial Intelligence

2022-12-19

The receptive field (RF), which determines the region of a time series that is "seen" and used, is critical to performance in time series classification (TSC). However, the variation of signal scales across and within time series data makes it challenging to choose proper RF sizes for TSC. In this paper, we propose a dynamic sparse network (DSN) with sparse connections for TSC, which can learn to cover various RFs without cumbersome hyperparameter tuning. The kernels in each sparse layer are sparse and can be explored within constrained regions by dynamic sparse training, which makes it possible to reduce the resource cost. The experimental results show that the proposed DSN model can achieve state-of-the-art performance on both univariate and multivariate TSC datasets at less than 50% of the computational cost of recent baseline methods, opening the path towards more accurate, resource-aware methods for time series analysis. Our code is publicly available at: https://github.com/QiaoXiao7282/DSN.
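To make the receptive-field notion concrete, the sketch below computes the RF of a stack of 1D convolution layers using the standard recurrence for kernel size, dilation, and stride. It is not part of DSN itself; the function name and the `(kernel_size, dilation, stride)` tuple layout are assumptions made for this example.

```python
def receptive_field(layers):
    """Receptive field (in input time steps) of stacked 1D convolutions.

    `layers` is a list of (kernel_size, dilation, stride) tuples, ordered
    from the input side to the output side.
    """
    rf = 1    # a single output unit initially "sees" one time step
    jump = 1  # distance in input steps between adjacent units of the current layer
    for kernel_size, dilation, stride in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf
```

For instance, three layers with kernel size 3 cover 7 time steps without dilation and 15 with dilations 1, 2, and 4, which shows how quickly the choice of these hyperparameters changes what the network can see.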

You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets

Tianjin Huang , Tianlong Chen , Meng Fang , Vlado Menkovski , Jiaxu Zhao , Lu Yin , Yulong Pei , Decebal Constantin Mocanu , Zhangyang Wang , Mykola Pechenizkiy

Category: Machine Learning

2022-11-28

Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i.e., untrained networks). However, the presence of such untrained subnetworks in graph neural networks (GNNs) remains mysterious. In this paper, we carry out the first-of-its-kind exploration of discovering matching untrained GNNs. With sparsity as the core tool, we can find untrained sparse subnetworks at initialization that can match the performance of fully trained dense GNNs. Besides this already encouraging finding of comparable performance, we show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem, hence becoming a powerful tool to enable deeper GNNs without bells and whistles. We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations. We evaluate our method across widely used GNN architectures on various popular datasets, including the Open Graph Benchmark (OGB).

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

An Empirical Evaluation of Posterior Sampling for Constrained Reinforcement Learning

Danil Provodin , Pratik Gajane , Mykola Pechenizkiy , Maurits Kaptein

Category: Machine Learning

2022-09-08

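We study a posterior sampling approach to efficient exploration in constrained reinforcement learning. As an alternative to existing algorithms, we propose two simple algorithms that are statistically more efficient, simpler to implement, and computationally cheaper. The first algorithm is based on a linear formulation of the CMDP, and the second leverages the saddle-point formulation of the CMDP. Our empirical results demonstrate that, despite its simplicity, posterior sampling achieves state-of-the-art performance and, in some cases, significantly outperforms optimistic algorithms.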

Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost

Lu Yin , Shiwei Liu , Fang Meng , Tianjin Huang , Vlado Menkovski , Mykola Pechenizkiy

Categories: Machine Learning | Artificial Intelligence

2022-08-23

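Lottery tickets (LTs) make it possible to discover accurate and sparse subnetworks that can be trained in isolation to match the performance of dense networks. Ensembling, in parallel, is one of the oldest time-proven tricks in machine learning for improving performance by combining the outputs of multiple independent models. However, the benefits of ensembling in the context of LTs are diluted, since an ensemble does not directly lead to stronger sparse subnetworks but only leverages their predictions for a better decision. In this work, we first observe that directly averaging the weights of adjacent learned subnetworks significantly boosts the performance of LTs. Encouraged by this observation, we further propose an alternative way to perform an "ensemble" over the subnetworks identified by iterative magnitude pruning, via a simple interpolation strategy. We call our method Lottery Pools. In contrast to the naive ensemble, which brings no performance gain to any single subnetwork, Lottery Pools yields much stronger sparse subnetworks than the original LTs without requiring any extra training or inference cost. Across various modern architectures on CIFAR-10/100 and ImageNet, we show that our method achieves significant performance gains in both in-distribution and out-of-distribution scenarios. Impressively, evaluated with VGG-16 and ResNet-18, the produced sparse subnetworks outperform the original LTs by up to 1.88% on CIFAR-100 and 2.36% on CIFAR-100-C. The resulting dense network also surpasses the pre-trained dense model on CIFAR-100 and CIFAR-100-C, with a reported gain of up to 2.22%.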
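The weight-averaging observation can be sketched as follows. This is a simplified illustration on flat weight lists, not the authors' code: the function names, the fixed interpolation coefficient `alpha`, and the final magnitude-pruning step used to restore the target sparsity are assumptions made for this example.

```python
def interpolate(weights_a, weights_b, alpha):
    """Linear interpolation of two weight vectors of equal length."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(weights_a, weights_b)]


def magnitude_prune(weights, keep):
    """Keep the `keep` largest-magnitude entries and set the rest to zero."""
    if keep <= 0:
        return [0.0] * len(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def pool_tickets(ticket_a, ticket_b, alpha, keep):
    """Interpolate two sparse tickets, then prune back to `keep` non-zero weights."""
    return magnitude_prune(interpolate(ticket_a, ticket_b, alpha), keep)


# Two toy "tickets" with three non-zero weights each (hypothetical values).
ticket_a = [1.0, 0.0, -0.5, 0.0, 0.5, 0.0]
ticket_b = [0.5, 0.25, -1.0, 0.0, 0.0, 0.0]
pooled = pool_tickets(ticket_a, ticket_b, alpha=0.5, keep=3)
```

Because the two masks differ, plain averaging would leave four non-zero weights here; pruning back to three keeps the pooled subnetwork at the same sparsity as its inputs, so inference cost does not grow.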

PencilNet: Zero-Shot Sim-to-Real Transfer Learning for Robust Gate Perception in Autonomous Drone Racing

Huy Xuan Pham , Andriy Sarabakha , Mykola Odnoshyvkin , Erdal Kayacan

Category: Robotics

2022-07-28

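In autonomous and mobile robotics, one of the main challenges is robust perception of the environment, which is often unknown and dynamic, as in autonomous drone racing. In this work, we propose a novel neural network-based perception method for racing gate detection, PencilNet, which relies on a lightweight neural network backbone on top of a pencil filter. This approach unifies the prediction of the gates' 2D position, distance, and orientation. We show that our method is effective for zero-shot sim-to-real transfer learning that does not need any real-world training samples. Moreover, compared with state-of-the-art methods, it is highly robust to the illumination changes commonly seen during rapid flight. A thorough set of experiments demonstrates the effectiveness of this approach in multiple challenging scenarios, in which the drone completes various tracks under different lighting conditions.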

Memory-free Online Change-point Detection: A Novel Neural Network Approach

Zahra Atashgahi , Decebal Constantin Mocanu , Raymond Veldhuis , Mykola Pechenizkiy

Categories: Machine Learning | Artificial Intelligence

2022-07-08

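Change-point detection (CPD), which detects abrupt changes in the data distribution, is recognized as one of the most significant tasks in time series analysis. Despite the extensive literature on offline CPD, unsupervised online CPD still faces major challenges, including scalability, hyperparameter tuning, and learning constraints. To mitigate some of these challenges, in this paper we propose a novel deep learning approach for unsupervised online CPD from multi-dimensional time series, named Adaptive LSTM-Autoencoder Change-Point Detection (ALACPD). ALACPD exploits an LSTM-autoencoder-based neural network to perform unsupervised online CPD. It continuously adapts to incoming samples without keeping the previously received input, and is therefore memory-free. We perform an extensive evaluation on several real-world time series CPD benchmarks. We show that ALACPD, on average, ranks first among state-of-the-art CPD algorithms in terms of the quality of the time series segmentation, and that it is on par with the best performer in terms of the accuracy of the estimated change points. The implementation of ALACPD is available online on GitHub: https://github.com/zahraatashgahi/alacpd.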
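The "memory-free" property, adapting to each incoming sample without storing earlier ones, can be illustrated with a much simpler detector than ALACPD. The sketch below deliberately swaps the LSTM-autoencoder for exponentially weighted running statistics on a univariate stream; the class name, the parameters `alpha`, `threshold`, and `warmup`, and their default values are assumptions made for this example.

```python
import math


class OnlineShiftDetector:
    """Flags abrupt level shifts using only a running mean and variance."""

    def __init__(self, alpha=0.1, threshold=4.0, warmup=10):
        self.alpha = alpha          # weight given to the newest sample
        self.threshold = threshold  # deviation (in standard deviations) that counts as a change
        self.warmup = warmup        # samples to observe before flagging anything
        self.mean = 0.0
        self.var = 0.0
        self.count = 0

    def update(self, x):
        """Consume one sample; return True if it looks like a change point."""
        if self.count == 0:
            self.mean, self.var, self.count = x, 0.0, 1
            return False
        deviation = x - self.mean
        std = max(math.sqrt(self.var), 1e-8)
        if self.count >= self.warmup and abs(deviation) > self.threshold * std:
            # Restart the statistics from the new regime; no past samples are kept.
            self.mean, self.var, self.count = x, 0.0, 1
            return True
        # Exponentially weighted updates of the mean and variance.
        self.mean += self.alpha * deviation
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.count += 1
        return False


def detect_changes(stream, **kwargs):
    """Return the indices at which the detector reports a change."""
    detector = OnlineShiftDetector(**kwargs)
    return [i for i, x in enumerate(stream) if detector.update(x)]
```

The detector holds three numbers regardless of how long the stream is, which is the sense in which such a method is memory-free; a learned model like the one in the paper replaces the running statistics with network weights that are updated online.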

More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity

Shiwei Liu , Tianlong Chen , Xiaohan Chen , Xuxi Chen , Qiao Xiao , Boqian Wu , Mykola Pechenizkiy , Decebal Mocanu , Zhangyang Wang

Category: Computer Vision

2022-07-07

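Transformers have quickly shined in the computer vision world since the emergence of Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) seems to be challenged by increasingly effective Transformer-based models. Very recently, a couple of advanced convolutional models have struck back with large kernels motivated by the local but large attention mechanism, showing appealing performance and efficiency. While one of them, RepLKNet, impressively manages to scale the kernel size to 31x31 with improved performance, the performance starts to saturate as the kernel size continues to grow, compared with the scaling trend of advanced ViTs such as Swin Transformer. In this paper, we explore the possibility of training extreme convolutions larger than 31x31 and test whether the performance gap can be eliminated by strategically enlarging convolutions. This study ends up with a recipe for applying extremely large kernels from the perspective of sparsity, which can smoothly scale kernels up to 61x61 with better performance. We propose the Sparse Large Kernel Network (SLaK), a pure CNN architecture equipped with 51x51 kernels that can perform on par with or better than state-of-the-art hierarchical Transformers and modern ConvNet architectures such as ConvNeXt and RepLKNet, on ImageNet classification as well as typical downstream tasks. Our code is available at https://github.com/vita-group/slak.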

Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training

Lu Yin , Vlado Menkovski , Meng Fang , Tianjin Huang , Yulong Pei , Mykola Pechenizkiy , Decebal Constantin Mocanu , Shiwei Liu

Categories: Machine Learning | Artificial Intelligence

2022-05-30

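Recent works on sparse neural network training (sparse training) have shown that a compelling trade-off between performance and efficiency can be achieved by training intrinsically sparse neural networks from scratch. Existing sparse training methods usually strive to find the best possible sparse subnetwork in a single run, without involving any expensive dense or pre-training steps. For instance, dynamic sparse training (DST), one of the most prominent directions, is capable of reaching performance competitive with dense training by iteratively evolving the sparse topology during the course of training. In this paper, we argue that it is better to allocate the limited resources to create multiple low-loss sparse subnetworks and superpose them into a stronger one, instead of allocating all resources to finding an individual subnetwork. To achieve this, two desiderata are required: (1) efficiently producing many low-loss subnetworks, the so-called cheap tickets, within one training process limited to the standard training time used in dense training; (2) effectively superposing these cheap tickets into one stronger subnetwork without going over the constrained parameter budget. To corroborate our conjecture, we present a novel sparse training approach, termed Sup-tickets, which can satisfy the above two desiderata concurrently in a single sparse-to-sparse training process. Across various modern architectures on CIFAR-10/100 and ImageNet, we show that Sup-tickets integrates seamlessly with existing sparse training methods and demonstrates consistent performance improvement.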
